Are Blogs Edited? A Linguistic Survey of Italian Blogs Using Search Engines

Author

  • Mirko Tavosanis
Abstract

Many blogs are written by people with no formal training in public writing, which could suggest a low level of editing and general correctness. A quantitative analysis of misspellings, however, shows that in their orthography Italian blogs are as well revised as conventional Italian newspaper texts. Moreover, their editing is more careful than that of the average Italian web page.

Context: an empirical grid

The nature of the texts published on the Web is still poorly described from the linguistic viewpoint. References to the “informal” nature of all texts written for the Web can often still be found. This view has been refuted in the past (see in particular Crystal 2001), but even in recent years descriptions of the linguistic features of real Web writing remain scarce, even for blogs.

As the starting point for an orthographic analysis, this survey takes the empirical grid for the linguistic description of Web texts presented in Tavosanis (2005). The grid is intended mainly as a compendium of words to enable linguists to speak more precisely about the texts published on the Web and to place particular phenomena in context. The whole classification, slightly different from current grids, is currently being tested on Italian Web pages; however, parts of it should also have wider implications and may be applied to other languages.

The grid includes four layers of description directly related to the writing process. The first of these layers, “Time allowed for writing”, is built on four main categories, each related to specific types of texts:

  • fast unedited writing (forum postings and, occasionally, Web sites); includes text written without planning and without a second reading and/or correction.
  • fast revised writing (forum postings and, occasionally, Web sites); includes text written with some degree of planning and/or correction.
  • conventional revised writing (Web sites and, occasionally, forum postings); includes text written within a process of planning and correction.
  • writing designed for other kinds of publishing (Web sites and, occasionally, forum postings); includes text written for other media and mechanically copied and published on the Web.

What place can be allocated to blogs in this classification? A preliminary answer to this question will be provided in the Conclusion.

Similar resources

Linguistic Features Of Italian Blogs: Literary Language

Preliminary surveys show that the language of blogs is not restricted to the more informal levels of expression. Instead blogs may include many kinds of written language: from simple personal notes to literary prose or poetry. The paper presents a sample of Italian blogs and comments on the results of the search of literary forms in two Web corpora using search engine queries.

SVMs for the Blogosphere: Blog Identification and Splog Detection

Weblogs, or blogs have become an important new way to publish information, engage in discussions and form communities. The increasing popularity of blogs has given rise to search and analysis engines focusing on the “blogosphere”. A key requirement of such systems is to identify blogs as they crawl the Web. While this ensures that only blogs are indexed, blog search engines are also often overw...

Building general- and special-purpose corpora by Web crawling

The Web is a potentially unlimited source of linguistic data; however, commercial search engines are not the best way for linguists to gather data from it. In this paper, we present a procedure to build language corpora by crawling and postprocessing Web data. We describe the construction of a very large Italian general-purpose Web corpus (almost 2 billion words) and a specialized Japanese “blo...

A Comparing between the impacts of text based indexing and folksonomy on ranking of images search via Google search engine

Background and Aim: The purpose of this study was to compare the impact of text based indexing and folksonomy in image retrieval via Google search engine. Methods: This study used experimental method. The sample is 30 images extracted from the book “Gray anatomy”. The research was carried out in 4 stages; in the first stage, images were uploaded to an “Instagram” account so the images are tagge...

Personal Experience Acquisition Support from Blogs using Event-Depicting Pictures

Internet users often write blogs related to their personal experiences, daily news, and so on. We can obtain blogs including personal experiences using search engines on the Web. However, search engines output various blogs including not only personal experiences but also other topics. Therefore, it takes too much time to obtain personal experiences because we have to read all output blogs.. Th...


Publication date: 2006